An Index for Selecting Mixed Explanatory Variables in the Analysis of Dependence
Abstract
In the analysis of multivariate dependence, two main objectives may be pursued: descriptive and decisional. In the descriptive approach, the selection of explanatory variables is determined by the researcher’s objectives and by the nature of the problem at hand. The decisional approach requires a parsimonious model in which only the variables with the highest explanatory power are included, so as to achieve the highest possible rate of correctly classified units. For numerical explanatory variables, the problem has been widely treated in discriminant analysis (Huberty, 1994) and in regression analysis (Cox and Snell, 1989). In the analysis of dependence for contingency tables, two approaches are known in the literature: modelling by means of log-linear models, and factorial methods. Within the framework of factorial methods, depending on the sampling scheme, we can refer to Barycentric Discriminant Analysis (BDA, Nakache, 1981) with retrospective sampling, or to Non Symmetrical Correspondence Analysis (NSCA, Lauro and D’Ambra, 1984) with prospective sampling (Palumbo, 1995), i.e. when no a priori information is available on the distribution of the explanatory variable. For variable selection in BDA, the chi-square criterion is adopted (Nakache et al., 1977). In NSCA, on the other hand, its authors proposed a selection criterion based on the multiple τ index of Gray and Williams. Under certain conditions, this index approximately follows a chi-square distribution. However, computing this criterion is expensive. This paper aims to develop an alternative criterion, computationally more efficient, for variable selection in decisional NSCA.
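As a point of reference for the τ-based criterion mentioned in the abstract, the following is a minimal sketch of the Goodman-Kruskal τ for a single two-way contingency table, i.e. the proportional reduction in prediction error for the response (columns) given the explanatory variable (rows). It is not the criterion proposed in the paper, and it is not the authors' implementation. The function name `goodman_kruskal_tau` and the row/column layout are assumptions made for illustration. The multiple τ of Gray and Williams is commonly described as the same quantity computed with each combination of explanatory categories treated as one row; that reading is also an assumption here.

```python
def goodman_kruskal_tau(table):
    """Goodman-Kruskal tau for predicting the column variable from the rows.

    `table` is a list of rows of non-negative counts (hypothetical layout:
    rows = explanatory categories, columns = response categories).
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table has no observations")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Sum of squared marginal proportions of the response: sum_j p_.j^2
    marginal = sum((c / n) ** 2 for c in col_totals)
    if marginal >= 1.0:
        raise ValueError("response is constant; tau is undefined")

    # Sum over cells of p_ij^2 / p_i. (rows with no observations are skipped)
    conditional = sum(
        (cell / n) ** 2 / (r / n)
        for row, r in zip(table, row_totals)
        if r > 0
        for cell in row
    )

    return (conditional - marginal) / (1.0 - marginal)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(goodman_kruskal_tau([[30, 10], [10, 30]]))
```

The value is 0 when the rows carry no information about the response and 1 when each row determines the response exactly.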
Similar resources
Iranian EFL Teachers’ Conceptions of Research: An Explanatory Mixed Methods Approach
This study reports a mixed methods design investigation into language teachers’ conceptions of research. The study drew on various sources of data, including teachers’ responses to questionnaire items, qualitative comments in follow-up interviews, and contributions to e-mail inquiries. Results showed that the teachers’ understanding of research is mainly associated with a standard view of resea...
Transition Models for Analyzing Longitudinal Data with Bivariate Mixed Ordinal and Nominal Responses
In many longitudinal studies, nominal and ordinal mixed bivariate responses are measured. In these studies, the aim is to investigate the effects of explanatory variables on these time-related responses. A regression analysis for these types of data must allow for the correlation among responses during the time. To analyze such ordinal-nominal responses, using a proposed weighting approach, an ...
Using Latent Variables in a Logistic Regression Model to Remove the Effect of Multicollinearity in the Analysis of Some Factors Related to Breast Cancer
Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analysis of the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) reduce the efficiency of model to a considerable degree. In this study we used latent variables to reduce the effects of ...
Supplier selection among alternative scenarios by Data envelopment analysis
A considerable problem in the competitive world of trade is choosing the best supply chain. As competition intensifies, finding the best supplier for manufacturing and for preparing raw material becomes very significant. Meanwhile, suppliers have different scenarios to fulfil, such as changing selection variables like lead-time, transportation cost and transportatio...
Building Regression Cost Models for Multidatabase Systems
A major challenge for performing global query optimization in a multidatabase system (MDBS) is the lack of cost models for local database systems at the global level. In this paper we present a statistical procedure based on multiple regression analysis for building cost models for local database systems in an MDBS. Explanatory variables that can be included in a regression model are identified ...
Publication date: 2000